That wraps up our quick review of the C# language features that allow LINQ to work its magic. However, why have LINQ in the first place? Well, as software developers, it is hard to deny that the vast majority of our programming time is spent obtaining and manipulating data. When speaking of “data,” it is very easy to immediately envision information contained within relational databases. However, another popular location in which data exists is within XML documents (*.config files, locally persisted DataSets, or in-memory data returned from WCF services).
Data can be found in numerous places beyond these two common homes for information. For instance, say you have an array or generic List<T> type containing 300 integers, and you want to obtain a subset that meets a given criterion (e.g., only the odd or even members in the container, only prime numbers, only nonrepeating numbers greater than 50). Or perhaps you are making use of the reflection APIs and need to obtain only metadata descriptions for each class deriving from a particular parent class within an array of Types. Indeed, data is everywhere.
Prior to .NET 3.5, interacting with a particular flavor of data required programmers to make use of very diverse APIs. Consider, for example, Table 13-1, which illustrates several common APIs used to access various types of data (I’m sure you can think of many other examples).
Table 13-1. Ways to Manipulate Various Types of Data
The Data You Want | How to Obtain It |
---|---|
Relational data | System.Data.dll (e.g., the System.Data.SqlClient namespace), etc. |
XML document data | System.Xml.dll |
Metadata tables | The System.Reflection namespace |
Collections of objects | System.Array and the System.Collections/System.Collections.Generic namespaces |
Of course, nothing is wrong with these approaches to data manipulation. In fact, when programming with .NET 4.0/C# 2010, you can (and will) certainly make direct use of ADO.NET, the XML namespaces, reflection services, and the various collection types. However, the basic problem is that each of these APIs is an island unto itself, which offers very little in the way of integration. True, it is possible (for example) to save an ADO.NET DataSet as XML, and then manipulate it via the System.Xml namespaces, but nonetheless, data manipulation remains rather asymmetrical.
The LINQ API is an attempt to provide a consistent, symmetrical manner in which programmers can obtain and manipulate “data” in the broad sense of the term. Using LINQ, you can create constructs called query expressions directly within the C# programming language. These query expressions are based on numerous query operators that have been intentionally designed to look and feel very similar (but not quite identical) to a SQL expression.
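To make the idea concrete, here is a minimal sketch of a query expression applied to the earlier "subset of integers" scenario. The class name, method name, and sample values are my own, chosen purely for illustration; the from, where, orderby, and select keywords are the C# query operators described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class QueryExpressionSketch
{
    // Returns the odd values greater than 50, without repeats, in ascending order.
    // (Hypothetical helper; the name and criteria are illustrative only.)
    public static List<int> OddValuesOver50(IEnumerable<int> numbers)
    {
        // Distinct() removes repeated values before the query filters and sorts them.
        var subset = from n in numbers.Distinct()
                     where n > 50 && n % 2 != 0
                     orderby n
                     select n;
        return subset.ToList();
    }

    static void Main()
    {
        int[] data = { 12, 51, 97, 51, 60, 73, 8 };
        foreach (int n in OddValuesOver50(data))
        {
            Console.WriteLine(n);   // Prints 51, 73, 97 (one per line).
        }
    }
}
```

Notice how the query reads much like a SQL SELECT statement turned on its head: the data source comes first (from), and the projection comes last (select).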
The twist, however, is that a query expression can be used to interact with numerous types of data, even data that has nothing to do with a relational database. Strictly speaking, “LINQ” is the term used to describe this overall approach to data access. However, based on where you are applying your LINQ queries, you will encounter more specific terms, such as LINQ to Objects (arrays and collections), LINQ to XML (XML document data), and LINQ to DataSet (ADO.NET DataSet types).
To be sure, Microsoft seems quite dedicated to integrating LINQ support deeply within the .NET programming environment. As time goes on, it would be very safe to bet that LINQ will become an integral part of the .NET base class libraries, languages, and Visual Studio itself.
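To illustrate the earlier point that query expressions are not tied to relational databases, the following sketch applies the same syntax to two very different data sources: an array of Type objects (reflection metadata) and a small XML fragment. Everything named here (the class, both methods, and the config-style XML) is a hypothetical example of mine; only the XElement type from System.Xml.Linq and the standard reflection members are part of the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class NonRelationalQuerySketch
{
    // Metadata: names of the non-abstract classes in 'candidates' deriving from 'parent'.
    public static List<string> SubclassNames(IEnumerable<Type> candidates, Type parent)
    {
        var names = from t in candidates
                    where t.IsClass && !t.IsAbstract && t.IsSubclassOf(parent)
                    orderby t.Name
                    select t.Name;
        return names.ToList();
    }

    // XML: the value of the "key" attribute on each <add> element, in document order.
    public static List<string> SettingKeys(string xml)
    {
        XElement root = XElement.Parse(xml);
        var keys = from e in root.Descendants("add")
                   select (string)e.Attribute("key");
        return keys.ToList();
    }

    static void Main()
    {
        Type[] types =
        {
            typeof(ArgumentOutOfRangeException),
            typeof(string),
            typeof(ArgumentNullException)
        };
        foreach (string name in SubclassNames(types, typeof(ArgumentException)))
        {
            Console.WriteLine(name);   // ArgumentNullException, then ArgumentOutOfRangeException.
        }

        // Illustrative config-style fragment (not taken from a real file).
        string xml = "<appSettings><add key='Timeout' value='30'/>" +
                     "<add key='Mode' value='Debug'/></appSettings>";
        foreach (string key in SettingKeys(xml))
        {
            Console.WriteLine(key);    // Timeout, then Mode.
        }
    }
}
```

The shape of the query is the same in both methods; only the data source on the right side of the in operator changes.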
It is also very important to point out that a LINQ query expression (unlike a traditional SQL statement) is strongly typed. Therefore, the C# compiler will keep us honest and make sure that these expressions are syntactically well formed. On a related note, query expressions have metadata representation within the assembly that makes use of them, as the C# LINQ query operators always map to a rich underlying object model. Tools such as Visual Studio 2010 can use this metadata for useful features such as IntelliSense, autocompletion, and so forth.
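Here is a small sketch of what strong typing buys you. The class and method names are hypothetical; the point is that the compiler infers the element type of the result from the select clause, and that the same query can be written as direct calls to the Where() and Select() extension methods of System.Linq.Enumerable, which is the underlying object model in the LINQ to Objects case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class StronglyTypedQuerySketch
{
    // Query-expression form: lengths of the names longer than five characters.
    public static IEnumerable<int> LongNameLengths(IEnumerable<string> names)
    {
        // 'select s.Length' yields int values, so the result is IEnumerable<int>.
        IEnumerable<int> lengths = from s in names
                                   where s.Length > 5
                                   select s.Length;

        // The next line would be rejected at compile time, as int has no ToUpper() method:
        // var broken = from n in lengths select n.ToUpper();

        return lengths;
    }

    // The same query expressed as explicit extension method calls.
    public static IEnumerable<int> LongNameLengthsMethodSyntax(IEnumerable<string> names)
    {
        return names.Where(s => s.Length > 5).Select(s => s.Length);
    }

    static void Main()
    {
        string[] games = { "Fallout", "Halo", "Morrowind" };
        Console.WriteLine(string.Join(", ", LongNameLengths(games)));             // 7, 9
        Console.WriteLine(string.Join(", ", LongNameLengthsMethodSyntax(games))); // 7, 9
    }
}
```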
As mentioned in Chapter 2, the New Project dialog of Visual Studio 2010 has the option of selecting which version of the .NET platform you wish to compile against. When you opt to compile against .NET 3.5 or higher, each of the project templates will automatically reference the key LINQ assemblies, which can be viewed using the Solution Explorer. Table 13-2 documents the role of the key LINQ assemblies. However, you will encounter additional LINQ libraries over the remainder of this book.
Table 13-2. Core LINQ-centric Assemblies
Assembly | Meaning in Life |
---|---|
System.Core.dll | Defines the types that represent the core LINQ API. This is the one assembly you must have access to if you wish to use any LINQ API, including LINQ to Objects. |
System.Data.DataSetExtensions.dll | Defines a handful of types to integrate ADO.NET types into the LINQ programming paradigm (LINQ to DataSet). |
System.Xml.Linq.dll | Provides functionality for using LINQ with XML document data (LINQ to XML). |
In order to work with LINQ to Objects, you must make sure that every C# code file that contains LINQ queries imports the System.Linq namespace (defined within System.Core.dll). If you do not do so, you will run into a number of problems. As a very good rule of thumb, if you see a compiler error looking similar to this:
Error 1 Could not find an implementation of the query pattern for source type 'int[]'. 'Where' not found. Are you missing a reference to 'System.Core.dll' or a using directive for 'System.Linq'?
the chances are extremely good that your C# file does not have the following using directive (and believe me, I speak from experience!):
using System.Linq;
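For context, here is a minimal sketch of a complete code file in which that directive does its job. The class and method names are hypothetical. If you comment out the using System.Linq; line, you should expect the query below to trigger a "query pattern" error like the one shown above, because the compiler can no longer find a Where method for int[].

```csharp
using System;
using System.Linq;   // Brings the LINQ to Objects extension methods (Where, Select, etc.) into scope.

static class UsingDirectiveSketch
{
    // Returns the even members of the incoming array, in their original order.
    public static int[] EvenValues(int[] source)
    {
        // The 'where' clause is compiled into a call to a Where method, which
        // for an int[] is found only via the System.Linq namespace.
        var evens = from n in source
                    where n % 2 == 0
                    select n;
        return evens.ToArray();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", EvenValues(new[] { 1, 2, 3, 4, 10 })));   // 2, 4, 10
    }
}
```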